Quadratic Programming Feature Selection

Authors

  • Irene Rodríguez-Luján
  • Ramón Huerta
  • Charles Elkan
  • Carlos Santa Cruz
Abstract

Identifying a subset of features that preserves classification accuracy is a problem of growing importance, because of the increasing size and dimensionality of real-world data sets. We propose a new feature selection method, named Quadratic Programming Feature Selection (QPFS), that reduces the task to a quadratic optimization problem. In order to limit the computational complexity of solving the optimization problem, QPFS uses the Nyström method for approximate matrix diagonalization. QPFS is thus capable of dealing with very large data sets, for which the use of other methods is computationally expensive. In experiments with small and medium data sets, the QPFS method leads to classification accuracy similar to that of other successful techniques. For large data sets, QPFS is superior in terms of computational efficiency.
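The core of QPFS is a quadratic program over feature weights: minimize a trade-off between pairwise feature redundancy (a quadratic term) and per-feature relevance to the class (a linear term), subject to the weights forming a simplex. A minimal sketch of that formulation is below, using absolute Pearson correlation for both the redundancy matrix and the relevance vector (one of the similarity measures considered in the paper) and a generic SLSQP solver in place of the paper's Nyström-accelerated solver; the function name `qpfs_weights` and the toy data are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.optimize import minimize

def qpfs_weights(X, y, alpha=0.5):
    """Sketch of the QPFS quadratic program:
        min_x  (1 - alpha) * x'Qx - alpha * f'x
        s.t.   sum(x) = 1,  x >= 0
    Q (redundancy): |Pearson correlation| between pairs of features.
    f (relevance):  |Pearson correlation| of each feature with y.
    Larger solved weight x_j means feature j is ranked as more important.
    """
    n, d = X.shape
    Q = np.abs(np.corrcoef(X, rowvar=False))          # d x d redundancy matrix
    f = np.abs(np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(d)]))
    obj = lambda x: (1 - alpha) * x @ Q @ x - alpha * f @ x
    cons = [{"type": "eq", "fun": lambda x: x.sum() - 1.0}]
    res = minimize(obj, np.full(d, 1.0 / d), method="SLSQP",
                   bounds=[(0.0, None)] * d, constraints=cons)
    return res.x

# Toy data: feature 0 predicts y, feature 1 is pure noise,
# feature 2 is a redundant copy of feature 0.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200).astype(float)
X = np.column_stack([y + 0.1 * rng.normal(size=200),
                     rng.normal(size=200),
                     y + 0.1 * rng.normal(size=200)])
w = qpfs_weights(X, y)  # noise feature should receive the smallest weight
```

The quadratic term penalizes placing weight on mutually correlated features, so redundant features share the mass that a purely relevance-based ranking would give each of them; the Nyström approximation described in the abstract replaces the full `Q` with a low-rank factorization so the program stays tractable when `d` is large.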


Similar Articles

Scaling-Up Quadratic Programming Feature Selection

Domains such as vision, bioinformatics, web search and web ranking involve datasets where the number of features is very large. Feature selection is commonly employed to deal with high-dimensional data. Recently, Quadratic Programming Feature Selection (QPFS) has been shown to outperform many of the existing feature selection methods for a variety of datasets. In this paper, we propose a Sequentia...


Feature Selection with Complexity Measure in a Quadratic Programming Setting

Feature selection is a topic of growing interest, mainly due to the increasing amount of information, and is an essential task in many machine learning problems with high-dimensional data. Selecting a subset of relevant features helps to reduce the complexity of the problem and to build robust learning models. This work presents an adaptation of a recent quadratic programming feature...


A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

A novel linear feature selection algorithm is presented based on the global minimization of a data-dependent generalization error bound. Feature selection and scaling algorithms often lead to non-convex optimization problems, which in many previous approaches were addressed through gradient descent procedures that can only guarantee convergence to a local minimum. We propose an alternative appr...


Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria

This paper presents a comprehensive analysis of the multicollinearity problem in data fitting. Data fitting is stated as a single-objective optimization problem in which an objective function measures the error of approximating the target vector with some function of the given features. Linear dependence between features means that the multicollinearity problem exists and leads to instability and re...


Fast Feature Selection from Microarray Expression Data via Multiplicative Large Margin Algorithms

New feature selection algorithms for linear threshold functions are described which combine backward elimination with an adaptive regularization method. This makes them particularly suitable to the classification of microarray expression data, where the goal is to obtain accurate rules depending on few genes only. Our algorithms are fast and easy to implement, since they center on an incrementa...


Journal:
  • Journal of Machine Learning Research

Volume 11  Issue 

Pages  -

Publication year 2010